Machine-Learning for Spammer Detection in Crowd-Sourcing

Authors

  • Harry Halpin
  • Roi Blanco
Abstract

In a series of evaluation experiments using naive judges recruited and managed via Amazon’s Mechanical Turk, on a task drawn from information retrieval (IR), we show that an SVM achieves very high accuracy when the machine-learner is trained and tested on a single task, and that the method is portable from more complex tasks to simpler tasks, but not vice versa.

Similar resources

Active Learning and Crowd-Sourcing for Machine Translation

In recent years, corpus based approaches to machine translation have become predominant, with Statistical Machine Translation (SMT) being the most actively progressing area. Success of these approaches depends on the availability of parallel corpora. In this paper we propose Active Crowd Translation (ACT), a new paradigm where active learning and crowd-sourcing come together to enable automatic...

Scaling Up Crowd-Sourcing to Very Large Datasets: A Case for Active Learning

Crowd-sourcing has become a popular means of acquiring labeled data for many tasks where humans are more accurate than computers, such as image tagging, entity resolution, and sentiment analysis. However, due to the time and cost of human labor, solutions that rely solely on crowd-sourcing are often limited to small datasets (i.e., a few thousand items). This paper proposes algorithms for integrat...

Active Learning for Crowd-Sourced Databases

Crowd-sourcing has become a popular means of acquiring labeled data for many tasks where humans are more accurate than computers, such as image tagging, entity resolution, or sentiment analysis. However, due to the time and cost of human labor, solutions that solely rely on crowd-sourcing are often limited to small datasets (i.e., a few thousand items). This paper proposes algorithms for integr...

Annotating biomedical ontology terms in electronic health records using crowd-sourcing

Electronic health records have been adopted by many institutions and constitute an important source of biomedical information. Text mining methods can be applied to this type of information to automatically extract useful knowledge. We propose a crowd-sourcing pipeline to improve the precision of extraction and normalization of biomedical terms. Although crowd-sourcing has been applied in other...

Crowd-Sourced AI Authoring with ENIGMA

ENIGMA is an experimental platform for collaborative authoring of the behaviour of autonomous virtual characters in interactive narrative applications. The main objective of this system is to overcome the bottleneck of knowledge acquisition that exists in generative storytelling systems through a combination of crowd-sourcing and machine learning. While the authoring front-end of the applicatio...

Publication date: 2012